Highly Heterogeneous XML Collections: How to Retrieve Precise Results?

نویسندگان

  • Ismael Sanz
  • Marco Mesiti
  • Giovanna Guerrini
  • Rafael Berlanga Llavori
چکیده

Highly heterogeneous XML collections are thematic collections exploiting different structures: the parent-child or ancestor-descendant relationships are not preserved and vocabulary discrepancies in the element names can occur. In this setting current approaches return answers with low precision. By means of similarity measures and semantic inverted indices we present an approach for improving the precision of query answers without compromising performance.

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

منابع مشابه

خوشه‌بندی فراابتکاری اسناد فارسی اِکس‌اِم‌اِل مبتنی بر شباهت ساختاری و محتوایی

Due to the increasing number of documents, XML, effectively organize these documents in order to retrieve useful information from them is essential. A possible solution is performed on the clustering of XML documents in order to discover knowledge. Clustering XML documents is a key issue of how to measure the similarity between XML documents. Conventional clustering of text documents using a do...

متن کامل

Information Retrieval Systems in XML Based Database – A review

XML the eXtensible Markup Language has emerged as a new standard for data representation and exchange over the Internet. It will become a universal format for data exchange on the Web and that in the near future we will find vast amounts of documents in XML format on the Web. As a result, it has become crucial to sort large collections of XML documents and retrieve relevant information from the...

متن کامل

Vague Content and Structure (VCAS) Retrieval for Document-centric XML Collections

Querying document-centric XML collections with structure conditions improves retrieval precisions. The structures of such XML collections, however, are often too complex for users to fully grasp. Thus, for queries regarding such collections, it is more appropriate to retrieve answers that approximately match the structure and content conditions in these queries, a process also known as vague co...

متن کامل

Approximate Tree Embedding for Querying XML Data

Querying heterogeneous collections of data-centric XML documents requires a combination of database languages and concepts used in information retrieval, in particular similarity search and ranking. In this paper we present an approach to find approximate answers to formal user queries. We reduce the problem of answering queries against XML document collections to the well-known unordered tree ...

متن کامل

ذخیره در منابع من


  با ذخیره ی این منبع در منابع من، دسترسی به آن را برای استفاده های بعدی آسان تر کنید

برای دانلود متن کامل این مقاله و بیش از 32 میلیون مقاله دیگر ابتدا ثبت نام کنید

ثبت نام

اگر عضو سایت هستید لطفا وارد حساب کاربری خود شوید

عنوان ژورنال:

دوره   شماره 

صفحات  -

تاریخ انتشار 2006